skip to main content


Search for: All records

Creators/Authors contains: "Wu, Hau-tieng"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Abstract Kernelized Gram matrix $W$ constructed from data points $\{x_i\}_{i=1}^N$ as $W_{ij}= k_0( \frac{ \| x_i - x_j \|^2} {\sigma ^2} ) $ is widely used in graph-based geometric data analysis and unsupervised learning. An important question is how to choose the kernel bandwidth $\sigma $, and a common practice called self-tuned kernel adaptively sets a $\sigma _i$ at each point $x_i$ by the $k$-nearest neighbor (kNN) distance. When $x_i$s are sampled from a $d$-dimensional manifold embedded in a possibly high-dimensional space, unlike with fixed-bandwidth kernels, theoretical results of graph Laplacian convergence with self-tuned kernels have been incomplete. This paper proves the convergence of graph Laplacian operator $L_N$ to manifold (weighted-)Laplacian for a new family of kNN self-tuned kernels $W^{(\alpha )}_{ij} = k_0( \frac{ \| x_i - x_j \|^2}{ \epsilon \hat{\rho }(x_i) \hat{\rho }(x_j)})/\hat{\rho }(x_i)^\alpha \hat{\rho }(x_j)^\alpha $, where $\hat{\rho }$ is the estimated bandwidth function by kNN and the limiting operator is also parametrized by $\alpha $. When $\alpha = 1$, the limiting operator is the weighted manifold Laplacian $\varDelta _p$. Specifically, we prove the point-wise convergence of $L_N f $ and convergence of the graph Dirichlet form with rates. Our analysis is based on first establishing a $C^0$ consistency for $\hat{\rho }$ which bounds the relative estimation error $|\hat{\rho } - \bar{\rho }|/\bar{\rho }$ uniformly with high probability, where $\bar{\rho } = p^{-1/d}$ and $p$ is the data density function. Our theoretical results reveal the advantage of the self-tuned kernel over the fixed-bandwidth kernel via smaller variance error in low-density regions. In the algorithm, no prior knowledge of $d$ or data density is needed. The theoretical results are supported by numerical experiments on simulated data and hand-written digit image data. 
    more » « less
  2. null (Ed.)
  3. Abstract

    The $p$-Laplacian has attracted more and more attention in data analysis disciplines in the past decade. However, there is still a knowledge gap about its behavior, which limits its practical application. In this paper, we are interested in its iterative behavior in domains contained in two-dimensional Euclidean space. Given a connected set $\varOmega _0 \subset \mathbb{R}^2$, define a sequence of sets $(\varOmega _n)_{n=0}^{\infty }$ where $\varOmega _{n+1}$ is the subset of $\varOmega _n$ where the first eigenfunction of the (properly normalized) Neumann $p$-Laplacian $ -\varDelta ^{(p)} \phi = \lambda _1 |\phi |^{p-2} \phi $ is positive (or negative). For $p=1$, this is also referred to as the ratio cut of the domain. We conjecture that these sets converge to the set of rectangles with eccentricity bounded by 2 in the Gromov–Hausdorff distance as long as they have a certain distance to the boundary $\partial \varOmega _0$. We establish some aspects of this conjecture for $p=1$ where we prove that (1) the 1-Laplacian spectral cut of domains sufficiently close to rectangles is a circular arc that is closer to flat than the original domain (leading eventually to quadrilaterals) and (2) quadrilaterals close to a rectangle of aspect ratio $2$ stay close to quadrilaterals and move closer to rectangles in a suitable metric. We also discuss some numerical aspects and pose many open questions.

     
    more » « less
  4. Summary

    Periodicity and trend are features describing an observed sequence, and extracting these features is an important issue in many scientific fields. However, it is not an easy task for existing methods to analyse simultaneously the trend and dynamics of the periodicity such as time varying frequency and amplitude, and the adaptivity of the analysis to such dynamics and robustness to heteroscedastic dependent errors are not guaranteed. These tasks become even more challenging when there are multiple periodic components. We propose a non-parametric model to describe the dynamics of multicomponent periodicity and investigate the recently developed synchro-squeezing transform in extracting these features in the presence of a trend and heteroscedastic dependent errors. The identifiability problem of the non-parametric periodicity model is studied, and the adaptivity and robustness properties of the synchro-squeezing transform are theoretically justified in both discrete and continuous time settings. Consequently we have a new technique for decoupling the trend, periodicity and heteroscedastic, dependent error process in a general non-parametric set-up. Results of a series of simulations are provided, and the incidence time series of varicella and herpes zoster in Taiwan and respiratory signals observed from a sleep study are analysed.

     
    more » « less